Freight train scheduling via decentralised multi-agent deep reinforcement learning

Bretas, A. M. C.; Mendes, A.; Chalup, S.; Jackson, M.; Clement, R.; Sanhueza, C.

Title: Freight train scheduling via decentralised multi-agent deep reinforcement learning
Creator: Bretas, A. M. C.; Mendes, A.; Chalup, S.; Jackson, M.; Clement, R.; Sanhueza, C.
Relation: The 24th International Congress on Modelling and Simulation (MODSIM2021). Proceedings of The 24th International Congress on Modelling and Simulation (MODSIM2021) (Sydney, NSW 05-10 December, 2021) p. 743-749
Relation: https://www.mssanz.org.au/modsim2021/papers/M1/bretas.pdf
Publisher: Modelling and Simulation Society of Australia and New Zealand Inc
Resource Type: conference paper
Date: 2021
Description: Rail traffic planning and scheduling problems have challenged academia and industry for a few decades. Specifically, problems in the short-term and real-time horizons involve simultaneous decision-making by trains, stations and terminals. Approaches focused on decentralised decision-making have been successful in delivering real-world committed solutions. This work focuses on decentralised real-time decision-making in a closed freight rail network and applies multi-agent deep reinforcement learning (MADRL) to find efficient timetables. We apply the MADRL model to solve the traffic decisions arising in the Hunter Valley Coal Chain (HVCC) in New South Wales, Australia. The approach uses the same simulation model currently in use for capacity planning of the system, thus allowing tests with real data. The environment is modelled as a decentralised partially observable Markov decision process (dec-POMDP), where the train, load point, and dump station agents decide upon train movements based on local observations. The observations follow a novel nine-layer state encoding strategy for rail traffic management. This strategy lets us apply centralised learning with decentralised execution through proximal policy optimisation. The experiments revealed a significant performance improvement for the ten instances tested, which reproduce the challenges faced in the HVCC operations. The approach is suitable for varied levels of rail network complexity, generating efficient solutions without scaling issues. The MADRL approach outperformed the heuristic in use by HVCC's simulation model and a high-performance genetic algorithm in all instances, reaching performance improvements of up to 72.00% and 47.42%, respectively. Therefore, the framework combining MADRL with the simulation model can be applied to real-world instances efficiently and reliably. These results show the method's consistency and draw a safe path towards a decentralised rail traffic management system.
Subject: multi-agent deep reinforcement learning; simulation-based machine learning; decision-making; rail traffic management; train scheduling
Identifier: http://hdl.handle.net/1959.13/1512726
Identifier: uon:56668
Identifier: ISBN:9780987214386
Language: eng
Reviewed
